NSF PAR Search | NSF Public Access Repository

Cooperative Knowledge Distillation: A Learner Agnostic Approach

https://doi.org/10.1609/aaai.v38i13.29322

Livanos, Michael; Davidson, Ian; Wong, Stephen (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Knowledge distillation is a simple but powerful way to transfer knowledge from a teacher model to a student model. Existing work suffers from at least one of the following key limitations in the direction and scope of transfer, which restrict its use: all knowledge is transferred from teacher to student regardless of whether that knowledge is useful, the student is the only one learning in this exchange, and distillation typically transfers knowledge only from a single teacher to a single student. We formulate a novel form of knowledge distillation, which we call cooperative distillation, in which many models can act as both students and teachers. The models cooperate as follows: a model (the student) identifies specific deficiencies in its performance and searches for another model (the teacher) that encodes learned knowledge into instructional virtual instances via counterfactual instance generation. Because different models may have different strengths and weaknesses, all models can act as either students or teachers (cooperation) when appropriate and only distill knowledge in areas specific to their strengths (focus). Since counterfactuals as a paradigm are not tied to any specific algorithm, we can use this method to distill knowledge between learners of different architectures, algorithms, and even feature spaces. We demonstrate that our approach not only outperforms baselines such as transfer learning, self-supervised learning, and multiple knowledge distillation algorithms on several datasets, but can also be used in settings where the aforementioned techniques cannot.
Full Text Available
MUSTANG: Multi-sample spatial transcriptomics data analysis with cross-sample transcriptional similarity guidance

https://doi.org/10.1016/j.patter.2024.100986

Niyakan, Seyednami; Sheng, Jianting; Cao, Yuliang; Zhang, Xiang; Xu, Zhan; Wu, Ling; Wong, Stephen TC; Qian, Xiaoning (May 2024, Patterns)

Full Text Available
Addressing the genetic/nongenetic duality in cancer with systems biology

https://doi.org/10.1016/j.trecan.2022.12.004

Kulkarni, Prakash; Wiley, H. Steven; Levine, Herbert; Sauro, Herbert; Anderson, Alexander; Wong, Stephen T.C.; Meyer, Aaron S.; Iyengar, Puneeth; Corlette, Kevin; Swanson, Kristin; et al (March 2023, Trends in Cancer)

Full Text Available
Adaptive Privacy Preserving Deep Learning Algorithms for Medical Data

Zhang, Xinyue; Ding, Jiahao; Wu, Maoqiang; Wong, Stephen TC.; Nguyen, Hien V.; Pan, Miao (January 2021, IEEE Winter Conference on Applications of Computer Vision)
Deep learning holds great promise for revolutionizing healthcare and medicine. Unfortunately, various inference attack models have demonstrated that deep learning puts sensitive patient information at risk. The high capacity of deep neural networks is the main reason behind the privacy loss. In particular, patient information in the training data can be unintentionally memorized by a deep network. Adversarial parties can extract that information given the ability to access or query the network. In this paper, we propose a novel privacy-preserving mechanism for training deep neural networks. Our approach adds decaying Gaussian noise to the gradients at every training iteration. This is in contrast to the mainstream approach adopted by Google's TensorFlow Privacy, which employs the same noise scale in each step of the whole training process. Compared to existing methods, our proposed approach provides an explicit closed-form mathematical expression to approximately estimate the privacy loss. It is easy to compute and can be useful when users would like to decide on the proper training time, noise scale, and sampling ratio during the planning phase. We provide extensive experimental results using one real-world medical dataset (chest radiographs from the CheXpert dataset) to validate the effectiveness of the proposed approach. The proposed differential-privacy-based deep learning model achieves significantly higher classification accuracy than existing methods with the same privacy budget.
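The idea of adding decaying Gaussian noise to gradients can be illustrated with a minimal sketch. This is not the authors' implementation: the exponential schedule `sigma0 * decay**step`, the per-example clipping step, and all function and parameter names below are assumptions chosen for illustration, and the sketch makes no claim about the resulting privacy guarantee.

```python
import math
import random


def noise_scale(step, sigma0=1.0, decay=0.99):
    """Hypothetical exponential schedule: the noise scale shrinks each step."""
    return sigma0 * decay ** step


def clip(grad, max_norm):
    """Rescale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def noisy_mean_gradient(per_example_grads, step, max_norm=1.0,
                        sigma0=1.0, decay=0.99, rng=random):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    The noise standard deviation is noise_scale(step) * max_norm, so later
    training steps receive less noise than earlier ones (assumed schedule).
    """
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    sigma = noise_scale(step, sigma0, decay) * max_norm
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


if __name__ == "__main__":
    grads = [[3.0, 4.0], [0.1, -0.2]]
    for step in (0, 100, 500):
        print(step, noise_scale(step), noisy_mean_gradient(grads, step))
```

A fixed-noise baseline corresponds to `decay=1.0` in this sketch, which keeps the same scale at every step.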
Full Text Available
Memory-Augmented Capsule Network for Adaptable Lung Nodule Classification

https://doi.org/10.1109/TMI.2021.3051089

Mobiny, Aryan; Yuan, Pengyu; Cicalese, Pietro A.; Moulik, Supratik K.; Garg, Naveen; Wu, Carol C.; Wong, Kelvin; Wong, Stephen T.; He, Tian Cheng; Nguyen, Hien V. (January 2021, IEEE Transactions on Medical Imaging)
Computer-aided diagnosis (CAD) systems must constantly cope with the perpetual changes in data distribution caused by different sensing technologies, imaging protocols, and patient populations. Adapting these systems to new domains often requires significant amounts of labeled data for re-training. This process is labor-intensive and time-consuming. We propose a memory-augmented capsule network for the rapid adaptation of CAD models to new domains. It consists of a capsule network that extracts feature embeddings from high-dimensional input, and a memory-augmented task network that exploits its stored knowledge from the target domains. Our network is able to efficiently adapt to unseen domains using only a few annotated samples. We evaluate our method using a large-scale public lung nodule dataset (LUNA), coupled with our own collected lung nodule and incidental lung nodule datasets. When trained on the LUNA dataset, our network requires only 30 additional samples from our collected lung nodule and incidental lung nodule datasets to achieve clinically relevant performance (0.925 and 0.891 area under the receiver operating characteristic curve (AUROC), respectively). This result is equivalent to using two orders of magnitude less labeled training data while achieving the same performance. We further evaluate our method by introducing heavy noise, artifacts, and adversarial attacks. Under these severe conditions, our network’s AUROC remains above 0.7 while the performance of state-of-the-art approaches drops to chance level.
Full Text Available
Differential Contributions of Pre- and Post-EMT Tumor Cells in Breast Cancer Metastasis

https://doi.org/10.1158/0008-5472.CAN-19-1427

Lourenco, Ana Rita; Ban, Yi; Crowley, Michael J.; Lee, Sharrell B.; Ramchandani, Divya; Du, Wei; Elemento, Olivier; George, Jason T.; Jolly, Mohit Kumar; Levine, Herbert; et al (January 2020, Cancer Research)

Full Text Available